Accurate and effective latent concept modeling for ad hoc information retrieval
Authors
Abstract
A keyword query represents a user's information need and is the result of a complex cognitive process that often leaves the need under-specified. We propose an unsupervised method, Latent Concept Modeling (LCM), for mining and modeling latent search concepts in order to recreate the conceptual view of the original information need. We use Latent Dirichlet Allocation (LDA) to extract highly specific query-related topics from pseudo-relevant feedback documents. We define these topics as the latent concepts of the user query. We perform a thorough evaluation of our approach over two large ad hoc TREC collections. Our findings reveal that the proposed method accurately models latent concepts, while being very effective in a query expansion retrieval setting.
RÉSUMÉ (translated from the French). A query is the representation of a user's information need, and is the result of a complex cognitive process that often leads to a poor choice of keywords. We propose an unsupervised method for modeling the implicit concepts of a query, with the aim of recreating the conceptual representation of the initial information need. We use Latent Dirichlet Allocation (LDA) to detect the implicit concepts of the query using pseudo-relevant documents. We evaluate this method in depth using two TREC test collections. In particular, we find that our approach models the implicit concepts of the query precisely, while achieving good performance in a document retrieval setting.
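The idea summarized in the abstract (fit LDA on pseudo-relevant feedback documents and read each topic as a latent concept of the query) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not say how the number of topics or feedback documents is chosen, nor how concepts are weighted during retrieval. The function names `lda_gibbs` and `latent_concepts`, the toy documents, the query terms, and all parameter values below are assumptions made for illustration, and a small collapsed Gibbs sampler stands in for whatever LDA inference the paper uses.

```python
import random

def lda_gibbs(docs, num_topics, iterations=200, alpha=0.1, beta=0.01, seed=0):
    """Collapsed Gibbs sampling for LDA over tokenized documents.
    Returns (vocab, phi), where phi[k][w] estimates P(word w | topic k)."""
    rng = random.Random(seed)
    vocab = sorted({w for doc in docs for w in doc})
    index = {w: i for i, w in enumerate(vocab)}
    V = len(vocab)
    doc_topic = [[0] * num_topics for _ in docs]        # n(d, k)
    topic_word = [[0] * V for _ in range(num_topics)]   # n(k, w)
    topic_total = [0] * num_topics                      # n(k)
    assignments = []
    # Random initial topic assignment for every token.
    for d, doc in enumerate(docs):
        z_doc = []
        for w in doc:
            k = rng.randrange(num_topics)
            z_doc.append(k)
            doc_topic[d][k] += 1
            topic_word[k][index[w]] += 1
            topic_total[k] += 1
        assignments.append(z_doc)
    for _ in range(iterations):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                wi = index[w]
                k = assignments[d][i]
                # Remove the token's current assignment from the counts.
                doc_topic[d][k] -= 1
                topic_word[k][wi] -= 1
                topic_total[k] -= 1
                # Conditional distribution over topics for this token.
                weights = [
                    (doc_topic[d][t] + alpha)
                    * (topic_word[t][wi] + beta) / (topic_total[t] + V * beta)
                    for t in range(num_topics)
                ]
                k = rng.choices(range(num_topics), weights=weights)[0]
                assignments[d][i] = k
                doc_topic[d][k] += 1
                topic_word[k][wi] += 1
                topic_total[k] += 1
    # Smoothed topic-word distributions.
    phi = [
        [(topic_word[k][wi] + beta) / (topic_total[k] + V * beta) for wi in range(V)]
        for k in range(num_topics)
    ]
    return vocab, phi

def latent_concepts(feedback_docs, num_topics=2, words_per_concept=3):
    """Treat each LDA topic of the feedback documents as one latent concept."""
    vocab, phi = lda_gibbs(feedback_docs, num_topics)
    concepts = []
    for dist in phi:
        ranked = sorted(range(len(vocab)), key=lambda wi: dist[wi], reverse=True)
        concepts.append([(vocab[wi], dist[wi]) for wi in ranked[:words_per_concept]])
    return concepts

# Toy stand-in for the top-ranked documents retrieved for a hypothetical query.
feedback_docs = [
    "solar panel energy cell power".split(),
    "solar energy power grid storage".split(),
    "tax credit subsidy policy incentive".split(),
    "policy subsidy tax energy incentive".split(),
]
query_terms = ["solar", "energy"]  # hypothetical original query
concepts = latent_concepts(feedback_docs)
# Naive expansion: append concept words that the query does not already contain.
expansion = sorted({w for c in concepts for w, _ in c} - set(query_terms))
expanded_query = query_terms + expansion
```

Here `expanded_query` simply appends unweighted concept words; a retrieval model would normally weight each concept and each word, a step this sketch leaves out.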
Similar resources
Are Semantically Coherent Topic Models Useful for Ad Hoc Information Retrieval?
Current topic modeling approaches for Information Retrieval do not allow query-oriented latent topics to be modeled explicitly. Moreover, the semantic coherence of the topics has never been considered in this field. We propose a model-based feedback approach that learns Latent Dirichlet Allocation topic models on the top-ranked pseudo-relevant feedback, and we measure the semantic coherence of those...
Modeling Search Engine's Explorations in Dynamic Search: An Ontological Perspective
Dynamic search is an information retrieval task, in which information systems retrieve documents for a user’s multiple queries. Each query starts a search iteration and aims to fulfill part of the user’s information need. Modeling search engine’s explorations in dynamic search serves to help search engines explore in the information space, retrieve relevant documents and fulfill the user’s info...
Modeling Term Associations for Ad-Hoc Retrieval Performance Within Language Modeling Framework
Previous research has shown that using term associations could improve the effectiveness of information retrieval (IR) systems. However, most of the existing approaches focus on query reformulation. Document reformulation has just begun to be studied recently. In this paper, we study how to utilize term association measures to do document modeling, and what types of measures are effective in do...
A Latent Variable Graphical Model Derivation of Diversity for Set-based Retrieval
Diversity has been heavily motivated as an objective criterion for result sets in the information retrieval literature and various ad-hoc heuristics have been proposed to explicitly optimize for it. In this paper, we start from first principles and show that optimizing a simple criterion of set-based relevance in a latent variable graphical model — a framework we refer to as probabilistic laten...
Improving Quality of Service Routing in Mobile Ad Hoc Networks Using OLSR
Mobile ad hoc networks (MANETs) are constructed by mobile nodes without an access point. Since a MANET has certain constraints, including power shortages, an unstable wireless environment and node mobility, more power-efficient and reliable routing protocols are needed. The OLSR protocol is an optimization of the classical link state algorithm. OLSR introduces an interesting concept, the multipoint r...
Journal title:
- Document Numérique
Volume 17 Issue
Pages -
Publication date 2014